Planning with General Objective Functions: Going Beyond Total Rewards

Neural Information Processing Systems

Standard sequential decision-making paradigms aim to maximize the cumulative reward when interacting with an unknown environment, i.e., maximize \sum_{h=1}^{H} r_h, where H is the planning horizon. However, this paradigm fails to model important practical applications, e.g., safe control, which aims to maximize the lowest reward, i.e., maximize \min_{h=1}^{H} r_h. In this paper, building on techniques from sketching algorithms, we propose a novel planning algorithm for deterministic systems that handles a large class of objective functions of the form f(r_1, r_2, \ldots, r_H) that are of interest in practical applications. We show that efficient planning is possible if f is symmetric under permutation of coordinates and satisfies certain technical conditions. Complementing our algorithm, we prove that removing any of these conditions makes the problem intractable in the worst case, demonstrating their necessity.
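The difference between the cumulative objective \sum_h r_h and a general objective such as \min_h r_h can be illustrated with a brute-force planner on a toy deterministic system. The dynamics, rewards, and horizon below are invented for illustration, and exhaustive enumeration is not the paper's sketching-based algorithm (which exists precisely to avoid this exponential search):

```python
from itertools import product

# Toy deterministic system: states 0..2, actions 0 (stay) and 1 (advance).
# All numbers here are made up for illustration.
transition = {(s, a): (s + a) % 3 for s in range(3) for a in (0, 1)}
reward = {(0, 0): 0.5, (0, 1): 2.0, (1, 0): 0.4,
          (1, 1): 0.0, (2, 0): 0.0, (2, 1): 0.6}

def plan(objective, horizon=3, start=0):
    """Brute-force search for the action sequence maximizing
    objective(r_1, ..., r_H) in the deterministic system above."""
    best_plan, best_val = None, float("-inf")
    for actions in product((0, 1), repeat=horizon):
        s, rewards = start, []
        for a in actions:
            rewards.append(reward[(s, a)])
            s = transition[(s, a)]
        val = objective(rewards)
        if val > best_val:
            best_plan, best_val = actions, val
    return best_plan, best_val

# Standard cumulative-reward objective vs. the safe-control (min) objective:
total_plan, total_val = plan(sum)   # chases the one large reward
safe_plan, safe_val = plan(min)     # prefers a steady, never-low trajectory
```

On this toy instance the two objectives select different plans: the sum-maximizer takes the action that yields the single large reward, while the min-maximizer avoids any step with low reward, which is the safe-control behavior described in the abstract.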


Review for NeurIPS paper: Planning with General Objective Functions: Going Beyond Total Rewards

Neural Information Processing Systems

Additional Feedback: Post-feedback response: I appreciate the author feedback. One item I want to flag, though: the feedback said (of one of the reviews), "We are grateful to the reviewer for providing a comprehensive list of papers on non-Markovian reward". I do not think the list is at all "comprehensive". It represents a number of very relevant and very significant papers, but there are others in this area.



Error analysis of generative adversarial network

Hasan, Mahmud, Sang, Hailin

arXiv.org Machine Learning

The generative adversarial network (GAN) is an important model developed for high-dimensional distribution learning in recent years. However, there is a pressing need for a comprehensive method to understand its error convergence rate. In this research, we study the error convergence rate of the GAN model based on a class of functions encompassing the discriminator and generator neural networks. Under our assumptions, these functions are of VC type with a bounded envelope function, enabling the application of the Talagrand inequality. By employing the Talagrand inequality and the Borel-Cantelli lemma, we establish a tight convergence rate for the error of GAN. This method can also be applied to existing error estimates for GANs and yields improved convergence rates. In particular, the error defined with the neural network distance is a special case of the error in our definition.
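For reference, the "neural network distance" mentioned in the abstract is standardly an integral probability metric over a discriminator class. A common formulation (notation assumed here, not taken from the paper) is:

```latex
d_{\mathcal{F}}(\mu, \nu)
  = \sup_{f \in \mathcal{F}}
    \left| \mathbb{E}_{X \sim \mu}[f(X)] - \mathbb{E}_{Y \sim \nu}[f(Y)] \right|,
```

where $\mathcal{F}$ is the class of functions realizable by the discriminator network, $\mu$ is the target distribution, and $\nu$ is the distribution induced by the generator. When $\mathcal{F}$ is of VC type with a bounded envelope, uniform concentration tools such as the Talagrand inequality control the supremum above, which is the mechanism the abstract alludes to.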